Iconic representation of content

ABSTRACT

A method and apparatus for determining and displaying icons representing files containing text, such as e-mail, web pages, text documents, word-processor documents, etc. In particular, the system determines the content of the text by examining words in the document. For example, if words relating to cars appear several times in the document, then the document's topic probably relates to cars. Next, the system searches a database of icons, which are labeled according to type. For example, the database may contain graphics relating to transportation (cars, planes, trains, etc.), computers (hard disk, monitor, keyboard, etc.), animals (mammal, reptile, amphibian), and many other categories. The system chooses the closest icon available and displays it as the icon representing the text document. (For example, the system may associate the document on cars with a car icon, and the car icon is displayed in appropriate regions of the desktop such as in file listings, desktop shortcuts, menus, task bars, etc.)

BACKGROUND OF THE INVENTION

[0001] 1. Field of the Invention

[0002] This invention generally relates to computer user interfaces. More specifically, the invention relates to a method and system for improving the searching of computer files via representation of their content as icons.

[0003] 2. Prior Art

[0004] Many functions in modern computers can be very time consuming: waiting for the computer to turn on, waiting for all programs to load, and then finally waiting to determine what each file contains. For example, a user who is not familiar with a certain computer may want to find a file about car mechanics. That person would probably have to go to a separate directory, such as DOS, Windows Explorer, etc., in order to be able to find a certain file. After reaching the separate directory, the user would have to specify a search on all the files in a drive. After this long search, the user would have to go through each file and read about it before eventually finding the file they were looking for.

[0005] Another method to solve this problem is to conduct a basic search for file names on the computer's operating system. After conducting the search, the user would have to browse through different files and open every file separately to determine whether the file is the one needed.

[0006] This can be a very long and useless process because of the amount of work needed to open and search through drives and directories. Other searches that compare keywords with text in documents can also be frustrating, especially for beginning users.

[0007] This process is also extremely lengthy and sometimes even pointless when considering the number of files that could show up in one file search. This method is very time consuming because the user may have to open a large number of files before finding the one he or she is looking for.

[0008] Another method is to use the icons listed throughout the drives and the desktop. By right clicking on an icon, the user can get a basic menu. Typically, one of the options on the menu is “properties,” which allows the user to view a small number of details about the file.

[0009] This process of trying to find a file is very unlikely to be helpful because it is almost a guess as to which files are the ones needed. The user would have to go through a large number of files before he or she finds the one needed. This method is also very time consuming because the user would have to go through a number of files and spend a few minutes looking over the details of each. Another reason why this method is not very helpful is that the details listed by the properties function are not very informative about the file's content.

SUMMARY OF THE INVENTION

[0010] An object of this invention is to provide a user, regardless of whether the user is familiar with a computer, with an easily accessible method to find programs or any files that he or she needs without taking up too much time or patience.

[0011] Another object of the present invention is to provide a versatile system for determining and displaying icons representing files or portions of files containing text such as e-mail, web pages, text documents, word-processor documents, etc.

[0012] A further object of the present invention is to determine the content of the text of a document by examining words in the document, and then choosing the closest icon available and displaying it as the icon representing the text document.

[0013] These and other objectives are attained with a method and apparatus for determining and displaying icons representing files containing text, such as e-mail, e-books, web pages, text documents, word-processor documents, etc. In particular, the system determines the content of the text by examining words in the document. For example, if words relating to cars appear several times in the document, then the document's topic probably relates to cars. Next, the system searches a database of icons, which are labeled according to type. For example, the database may contain graphics relating to transportation (cars, planes, trains, etc.), computers (hard disk, monitor, keyboard, etc.), animals (mammal, reptile, amphibian), and many other categories. The system chooses the closest icon available and displays it as the icon representing the text document. (For example, the system may associate the document on cars with a car icon, and the car icon is displayed in appropriate regions of the desktop such as in file listings, desktop shortcuts, menus, task bars, etc.)

[0014] One way for the system to select an appropriate icon is by comparing a content word from a document, such as car, with the database of icons, which also contains words associated with each icon. As an example, the database may contain records containing words and the names of icon (graphical image) files:

[0015] Icon Database:

Text         Icon file name
car          car.jpg
dog          dog.jpg
keyboard     keyboard.jpg
amphibian    frog.jpg
frog         frog.jpg

[0016] If the topic word is car, the system searches the text in the icon database for “car.” When there is a match, the system reads the icon file name car.jpg and displays the icon. The image car.jpg may include an advertisement.
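The lookup described in paragraph [0016] can be sketched as follows. This is a minimal illustration only: the in-memory dictionary standing in for the icon database, the function name find_icon, and the case-insensitive matching are assumptions made for the example and are not specified in the text.

```python
# Hypothetical in-memory stand-in for the icon database of paragraph [0015].
ICON_DATABASE = {
    "car": "car.jpg",
    "dog": "dog.jpg",
    "keyboard": "keyboard.jpg",
    "amphibian": "frog.jpg",
    "frog": "frog.jpg",
}


def find_icon(topic_word, database=ICON_DATABASE, default=None):
    """Return the icon file name whose text label matches the topic word.

    Matching is case-insensitive (an assumption); `default` is returned
    when the database holds no record for the word.
    """
    return database.get(topic_word.strip().lower(), default)
```

Note that two text entries (amphibian and frog) may point at the same icon file, as in the example database.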

[0017] Various methods are available for determining the “content” of a document, or of the sections of a document. Such methods include latent semantic indexing, known to those skilled in the art of content determination, and examination of words in titles and headings, and in the body of a document. For example, if a chapter title in a document contains the word amphibian, the chapter likely is about amphibians, and an amphibian picture (e.g. frog.jpg) may be displayed.

[0018] As an extension of this basic principle, topic icons may be determined several times in a document. For example, the topic of one paragraph may be cars and another paragraph topic might be trains. These icons may be displayed in the text document so that people can get an idea about the content of a document with a quick glance. The icons may also be displayed outside the document so that users can get an idea as to the nature and progression of sub topics in the document, and users may easily select sub topics by selecting the icons. For example, the overall content of a document might be displayed for each file in a Windows Explorer listing of files. “Overall content” might be determined by examining all the words in the entire document. Progressive content, represented by several icons, can be displayed next to paragraphs displayed in a display program (e.g. word processor, browser, etc.) or as a sequence of icons displayed elsewhere on the user's graphical user interface.

[0019] This method provides a visual mechanism for locating files on a user's hard disk and for understanding the idea of a document and the content of the files.

[0020] This method can also help many people because many computer users often make simple mistakes that can take from a few minutes to an hour to fix. By making it easy to choose a desired file without the need to check whether it is the file needed (only to discover that it is the wrong file and then have to go searching for the right one), the method provides a very fast and effective way to operate a user's desktop.

[0021] This method of easy access to files works by first summing up all the information about a file and putting it into a simple phrase that states what the content of the file is. The icon of the file is also relevant to what the file contains. This is done so that the user can just glance at a file and be able to tell what the file contains and whether it is what the user is looking for.

[0022] Further benefits and advantages of the invention will become apparent from a consideration of the following detailed description, given with reference to the accompanying drawings, which specify and show preferred embodiments of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

[0023] FIG. 1 is a block diagram illustrating the present invention.

[0024] FIG. 2 describes the structure of a semantic content extractor that may be used in the practice of this invention.

[0025] FIG. 3 illustrates the structure of an icon creator that may be used in this invention.

[0026] FIG. 4 shows how a person with a reading disability can use the icon system of this invention.

[0027] FIG. 5 gives an example of composite icons that represent multiple topics.

[0028] FIG. 6 is a flow chart illustrating a method embodying this invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0029] FIG. 1 is a block diagram explaining the icon process. 108 is a computer that represents a group of directories. 100 represents one directory in one location, 101 represents a second directory in a second location, and 102 represents a third directory in a third location. Each directory has a group of files listed as file 1, file 2, and so on. A module, the Semantic Content Extractor 103, is running in a computer. 103 can exist within a user's computer, but in this drawing it exists in a server connected to a network, 109. 103 has a running CPU, which extracts the information and content from all the files in directories 100-102. 104 is responsible for creating an icon using the information provided by the Semantic Content Extractor. Icons may include advertisements; for example, if the content is IBM computers, an ad for IBM computers may be presented, and the ad may be a hyperlink to IBM's Web page. These icons can also reside on a separate server from the Semantic Content Extractor. In order to create icons, 104 uses the database of icons, 106, which has a thorough list of icons. The database of icons, 106, is connected to the network, 109. The icons are created by the creator of icons module 104. In the module 105, the index of icons to files (or parts of icons to different parts of a text) is created. This indexer module 105 can also be located on a server. The indexer creates an icon and attaches it to a file, 110 and 111.

[0030] There are numerous methods for extracting the content or topic of text documents or portions of documents. These methods include counting the number of times a particular word appears in a text and applying latent semantic indexing, as is known to those skilled in the art.

[0031] FIG. 2 describes the structure of the Semantic Content Extractor, which is responsible for choosing appropriate data from which to make an icon. 200 represents the input text in a file. 210 determines the size of the text. This can be done by checking the byte size of the file. 201 counts words and characters, which can be added up to create an approximation for byte size. In order to speed up the counting process, a key word counter 202 can be created to count key words. Key words are words that are essential to represent the meaning of the file. Key words do not include words that are typical for any file (such as and, or, but, the, and so on). 207 is a database of key words that is created from other documents. 203 speeds up the keyword counting process by counting key phrases used in the text. 205 represents the database of key phrases, which holds all key phrases that were obtained from a training database (or from processing textual files in the past). 204 produces a language model (LM) from the counts that were produced by counting modules 210, 201, 202. The process of making language models from counts is described in the reference: Frederick Jelinek, “Statistical Methods for Speech Recognition”, The MIT Press, Cambridge, Massachusetts 1998.
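The key word counting of module 202 can be illustrated with a short sketch. The stop-word list, the tokenizing regular expression, and the function name count_key_words are assumptions chosen for the example; the text only states that words typical of any file are excluded.

```python
import re
from collections import Counter

# Assumed list of words "typical for any file"; the source names only
# "and, or, but, the, and so on".
COMMON_WORDS = {"and", "or", "but", "the", "a", "an", "of", "to", "in", "is"}


def count_key_words(text, common_words=COMMON_WORDS):
    """Count every word in the text that is not a common word."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(word for word in words if word not in common_words)


def most_frequent_key_word(text):
    """Return the key word with the highest count, or None for empty input."""
    counts = count_key_words(text)
    return counts.most_common(1)[0][0] if counts else None
```

A word that dominates these counts, such as car in a document about cars, is a natural candidate topic word for the icon lookup.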

[0032] 206 is the database of language models that were created from different texts belonging to different topics; one LM is made for each topic (for example, an LM on a medical topic, an LM related to travel, etc.). 220 is the topic identifier. It determines which language models provide higher likelihood scores (or likelihood ratios) for the input text 200. Since each LM is associated with a topic, this allows each textual part to be classified by topic.
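As a rough illustration of topic identification by likelihood score, the sketch below trains one unigram language model per topic and picks the topic whose model scores the input text highest. The unigram form, the add-one smoothing, and all function names are assumptions made for the example; the source does not specify the form of the language models.

```python
import math
from collections import Counter


def train_unigram_lm(texts):
    """Build a (word counts, total count) pair from training texts for one topic."""
    counts = Counter(word for text in texts for word in text.lower().split())
    return counts, sum(counts.values())


def log_likelihood(words, lm, vocab_size):
    """Log-likelihood of the words under a unigram LM with add-one smoothing."""
    counts, total = lm
    return sum(math.log((counts[w] + 1) / (total + vocab_size)) for w in words)


def identify_topic(text, topic_lms):
    """Return the topic whose language model gives the highest likelihood."""
    words = text.lower().split()
    vocab = set(words)
    for counts, _ in topic_lms.values():
        vocab.update(counts)
    size = len(vocab)
    return max(topic_lms, key=lambda t: log_likelihood(words, topic_lms[t], size))
```

Comparing the scores of two topics (a difference of log-likelihoods) gives a log likelihood ratio, which is one way to read the "likelihood ratios" mentioned above.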

[0033] If several parts of the text are marked with different topics, this can be used to associate several topics with the text and to make a composite icon that points to the different parts of the text with different topics.
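One simple way to picture this is to label each paragraph with a topic and merge consecutive paragraphs that share a topic into parts. The sketch below assumes the per-paragraph topic labels are already available; the function name and the (topic, first paragraph, last paragraph) tuple layout are hypothetical.

```python
from itertools import groupby


def segment_by_topic(paragraph_topics):
    """Merge consecutive paragraphs with the same topic into segments.

    Returns a list of (topic, first_paragraph_index, last_paragraph_index).
    """
    segments = []
    position = 0
    for topic, run in groupby(paragraph_topics):
        length = len(list(run))
        segments.append((topic, position, position + length - 1))
        position += length
    return segments
```

Each resulting segment could then be tied to one part of a composite icon, so that selecting that part leads to the corresponding portion of the text.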

[0034] The method for classification and segmentation of a text by topics using a likelihood ratio is described in patent application Ser. No. 09/124,075, for “Real Time Detection of Topical Changes and Topic Identification via Likelihood Based Methods”, filed on Jul. 29, 1998.

[0035] This process helps create a composite icon, which allows better access. 208, the file topic divider, divides the files into their necessary parts and helps create an icon. 209 creates an index of icons to files or an index of parts of icons to different parts of the text.

[0036] FIG. 3 illustrates the structure of the icon creator. 300 contains topics that were identified within the Semantic Content Extractor. Topics 1 through 3 have weights listed under them. These weights stand for the importance and significance of the topics that are associated with a file. 301 is the intelligent matcher that creates a match of data and images to create an icon. This is done using the database of images 303 and the database of icons 304. The database of images is used only if there is no matching icon for the data given. For example, if there were a topic concerning a car, the computer would search through the database of icons 304. If an icon were not found, one would be created using the database of images 303. 302 extracts an icon that best fits the data given and then creates it to fit a desktop or directory. The icon combiner 306 creates similar topic icons according to the weight 305 and content of each topic. 307 shows that each icon has an index attachment. This attachment to the file opens directly to the file, thus creating easy access to any desired file. This method for opening a file is very effective. For blind people, however, another method of opening files can be provided. A blind person can use a sound icon from the database of sound icons 308. This would enable the blind user to use their sense of hearing to choose the files they wish to open.
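The matcher's fallback from the icon database to the image database can be sketched as below. The dictionaries standing in for databases 303 and 304, the function name match_icons, and the tuple layout are illustrative assumptions; creating a new icon from an image is reduced here to returning the image file name.

```python
def match_icons(topic_weights, icon_db, image_db):
    """Match each topic to an icon, falling back to the image database.

    Topics are processed in descending order of weight. Each result is
    (topic, file_name, source, weight), where source is "icon" or "image".
    Topics found in neither database are skipped.
    """
    matched = []
    ranked = sorted(topic_weights.items(), key=lambda item: item[1], reverse=True)
    for topic, weight in ranked:
        if topic in icon_db:
            matched.append((topic, icon_db[topic], "icon", weight))
        elif topic in image_db:
            # No ready-made icon: an icon would be created from this image.
            matched.append((topic, image_db[topic], "image", weight))
    return matched
```

Ordering by weight lets a later combining step give the most significant topic the most prominent place in the resulting icon.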

[0037] FIG. 4 gives an illustration of how a person with a reading disability can use this icon system. 400 is a group of files that are formed into an icon attachment 401. The user then chooses an icon 402, using the pictures or sounds, and the user can then use a speech synthesizer 403 and can listen to a file.

[0038] FIG. 5 gives an example of composite icons that contain multiple topics. 500 shows an icon containing multiple topics, such as cars and travel, 501 and 502, and dealerships 503. The larger part of the file, 501, shows cars; the smaller part of the file shows travel. The intermediate-sized part of the file shows dealerships. 503 contains an index which lists information on cars or buildings 506. 504 shows where the information on cars is placed in the file. Using a fraction method, the files can be broken down, as shown in 504 and 505. 510 shows the file.
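The text does not define the "fraction method" in detail. One plausible reading, sketched here purely as an assumption, is that each topic's share of the composite icon is proportional to the share of the file devoted to that topic. The function name and the (topic, length) input format are hypothetical.

```python
from collections import Counter


def topic_fractions(segment_lengths):
    """Return each topic's fraction of the file from (topic, length) pairs.

    Lengths may be in any consistent unit (words, bytes, paragraphs).
    """
    totals = Counter()
    for topic, length in segment_lengths:
        totals[topic] += length
    whole = sum(totals.values())
    if whole == 0:
        return {}
    return {topic: amount / whole for topic, amount in totals.items()}
```

Under this reading, a file that is half about cars, with smaller parts on dealerships and travel, would yield a composite icon whose largest region is the car icon.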

[0039] FIG. 6 shows a flowchart of the method. At 600, a list of files is generated. Step 601 reads the content of each file, and at 602, topics are attached to each file. At 603, icons are generated for the files. At 604, if there are several topics, a composite icon is created containing many topics. At 607, an index of topics is created. At 605, a list of icons is printed near the file names. At 606, a list of icons can be created to list files.
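The flow of FIG. 6 can be sketched end to end as follows. This is a simplified illustration under several assumptions: files are given as an in-memory mapping of name to text, topics are attached by simple key word overlap rather than by language models, and a composite icon is represented as a list of icon file names. All names are hypothetical.

```python
def build_icon_listing(files, keywords_by_topic, icon_db):
    """Attach topics and icons to files and build an index of topics.

    Returns (listing, topic_index): listing maps each file name to its
    icon file names, and topic_index maps each topic to the file names
    it was attached to.
    """
    listing = {}
    topic_index = {}
    for name, text in files.items():                      # steps 600, 601
        words = set(text.lower().split())
        topics = [topic for topic, keywords in keywords_by_topic.items()
                  if words & keywords]                    # step 602
        icons = [icon_db[t] for t in topics if t in icon_db]  # step 603
        listing[name] = icons                             # step 604: several icons form a composite
        for topic in topics:                              # step 607
            topic_index.setdefault(topic, []).append(name)
    return listing, topic_index
```

The returned listing corresponds to the icons shown near file names at step 605, and the topic index supports listing files by icon at step 606.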

[0040] While it is apparent that the invention herein disclosed is well calculated to fulfill the objects stated above, it will be appreciated that numerous modifications and embodiments may be devised by those skilled in the art, and it is intended that the appended claims cover all such modifications and embodiments as fall within the true spirit and scope of the present invention.

1. A system for determining and displaying icons representing text files, comprising: a content extractor for determining the content of all or parts of a text file by examining words in the file; a means for associating the content with an icon; a selector for selecting an icon to represent the text file or portion of a file on the basis of the determined content of the text file; and a display for displaying the selected icons to represent the text file.
2. A system according to claim 1, wherein the selector includes means for selecting the closest one of a group of available icons to represent the text file.
3. A system according to claim 1, wherein the content extractor includes means for determining several topic icons for the text file.
4. A system according to claim 3, wherein the topic icons form a composite icon associated with different parts of the text file.
5. A system according to claim 3, wherein the several icons are sensed by different senses.
6. A system according to claim 1, wherein the icons facilitate use of a computer by people with various disabilities.
7. A system for representing contents of computer files via icons, the system comprising: a computer memory including a group of directories with lists of files; a semantic content extractor for extracting information and content from the files; and a module for creating icons representing the files on the basis of the information and content extracted by the semantic content extractor.
8. A system according to claim 7, wherein the semantic content extractor includes: a module that associates with a text file a language model, and word, key word and key phrase counts; a topic identifier that uses the language model and counts to identify a topic; and a module that partitions a text in a file by topic count.
9. A system according to claim 8, wherein the topic identifier uses a likelihood ratio to partition texts into parts by topics; the likelihoods in this ratio are defined by using probabilities of words from language models of the text in a file and language models for various topics that are stored in the database.
10. A method for creating a composite icon to allow greater access to computer files, comprising the steps of: using a file topic identification to perform segmentation and topic classification; and using a file topic divider to divide the files into parts using segmentation and topic classification from the file topic identification.
11. An icon creator for creating an icon representing a file, comprising: a semantic content extractor for identifying the importance and significance of topics associated with the file; and a matcher to create a match of data and images to create an icon using a database of images and a database of icons; and wherein each icon has an index attachment, which opens directly to the file.
12. An icon creator according to claim 11, wherein a blind person can use a sound icon using the database of sound icons; this would enable the blind user to use their sense of hearing to choose the file they wish to open.
13. An icon creator according to claim 11, further comprising means to allow a person with a reading disability to use the icon system, including a group of files that are formed into an icon attachment; the user then chooses an icon, using the pictures or sounds, and the user can then use a speech synthesizer to listen to a file.
14. An icon creator according to claim 11, wherein composite icons contain multiple topics such as cars and travel, and dealerships, the larger part of the file shows cars, the smaller part of the file shows travel; the middle sized part of the file shows dealerships; and further comprising means to contain an index which lists information on cars or buildings, means to show where the information on cars is placed in the file; and wherein, using a fraction method, the files can be broken down.
15. A method for creating icons, comprising: generating a list of files; reading the content of each file; attaching topics to each file; generating icons for the files; if several topics, creating a composite icon containing many topics; creating an index of topics; printing a list of icons near file names; and creating a list of icons to list files.
16. A method of determining and displaying icons representing files containing text, the method comprising the steps of: determining the content of a file by examining words in the file; searching a database of icons; on the basis of the determined content of the file, selecting one of the icons in the database to represent the file; and displaying the selected icon to represent the file.
17. A method according to claim 16, wherein in the database, each icon is associated with words, and wherein: the determining step includes the step of using a semantic content extractor to identify the importance and significance of topics associated with the file; and the selecting step includes the step of comparing said topics with the words in the database to select one of the icons to represent the file.
18. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for determining and displaying icons representing files containing text, said method steps comprising: determining the content of a file by examining words in the file; searching a database of icons; on the basis of the determined content of the file, selecting one of the icons in the database to represent the file; and displaying the selected icon to represent the file.
19. A program storage device according to claim 18, wherein in the database, each icon is associated with words, and wherein: the determining step includes the step of using a semantic content extractor to identify the importance and significance of topics associated with the file; and the selecting step includes the step of comparing said topics with the words in the database to select one of the icons to represent the file.
20. The system in claim 1, where the icons contain advertisements, which may be hyperlinks.
21. The system in claim 20, where users pay less for the system if ads are included.
22. The system in claim 20, where an advertiser pays the manufacturer or seller of the system.